11 research outputs found

    Bridging the Gap Between Ontology and Lexicon via Class-Specific Association Rules Mined from a Loosely-Parallel Text-Data Corpus

    There is a well-known lexical gap between content expressed in the form of natural language (NL) texts and content stored in an RDF knowledge base (KB). For tasks such as Information Extraction (IE), this gap needs to be bridged from NL to KB, so that facts extracted from text can be represented in RDF and added to an RDF KB. For tasks such as Natural Language Generation, this gap needs to be bridged from KB to NL, so that facts stored in an RDF KB can be verbalized and read by humans. In this paper we propose LexExMachina, a new methodology that induces correspondences between lexical elements and KB elements by mining class-specific association rules. As an example of such an association rule, consider the rule that predicts that if the text about a person contains the token "Greek", then this person has the relation nationality to the entity Greece. Another rule predicts that if the text about a settlement contains the token "Greek", then this settlement has the relation country to the entity Greece. Such rules can help in question answering, as they map an adjective to the relevant KB terms, and they can help in information extraction from text. We propose and empirically investigate a set of 20 types of class-specific association rules, together with different interestingness measures to rank them. We apply our method to a loosely-parallel text-data corpus that consists of data from DBpedia and texts from Wikipedia, and we evaluate the rules and provide empirical evidence for their utility for Question Answering.
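    The core idea of mining class-specific association rules from a loosely-parallel corpus can be illustrated with a minimal sketch. The toy records, the confidence-based ranking, and all names below are illustrative assumptions, not the paper's actual algorithm or data:

    ```python
    from collections import Counter

    # Toy loosely-parallel corpus: each record pairs an entity's class and the
    # tokens of its text with its KB facts (relation, object). Invented data.
    records = [
        {"cls": "Person", "tokens": {"greek", "poet"}, "facts": {("nationality", "Greece")}},
        {"cls": "Person", "tokens": {"greek", "actor"}, "facts": {("nationality", "Greece")}},
        {"cls": "Person", "tokens": {"french"}, "facts": {("nationality", "France")}},
        {"cls": "Settlement", "tokens": {"greek", "coastal"}, "facts": {("country", "Greece")}},
    ]

    def mine_rules(records, min_conf=0.5):
        """Mine class-specific rules (class, token) -> (relation, object),
        ranked by confidence, one simple interestingness measure."""
        token_count = Counter()
        joint_count = Counter()
        for r in records:
            for tok in r["tokens"]:
                token_count[(r["cls"], tok)] += 1
                for fact in r["facts"]:
                    joint_count[(r["cls"], tok, fact)] += 1
        rules = []
        for (cls, tok, fact), n in joint_count.items():
            conf = n / token_count[(cls, tok)]
            if conf >= min_conf:
                rules.append((cls, tok, fact, conf))
        return sorted(rules, key=lambda r: -r[3])

    for cls, tok, fact, conf in mine_rules(records):
        print(f"{cls}: '{tok}' -> {fact}  (conf={conf:.2f})")
    ```

    Note how the same token "greek" yields different rules for Person and Settlement: conditioning on the class is what keeps the mapping to KB terms unambiguous.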

    Recent developments for the linguistic linked open data infrastructure

    In this paper we describe the contributions made by the European H2020 project “Pret-a-LLOD” (‘Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors’) to the further development of the Linguistic Linked Open Data (LLOD) infrastructure. Pret-a-LLOD aims to develop a new methodology for building data value chains that are applicable to a wide range of sectors and applications, based around language resources and language technologies that can be integrated by means of semantic technologies. We describe the methods implemented for increasing the number of language data sets in the LLOD. We also present the approach for ensuring interoperability and for porting LLOD data sets and services to other infrastructures, as well as the project's contributions to existing standards.

    LexExMachinaQA: A framework for the automatic induction of ontology lexica for Question Answering over Linked Data

    No full text
    Elahi MF, Ell B, Cimiano P. LexExMachinaQA: A framework for the automatic induction of ontology lexica for Question Answering over Linked Data. Presented at the LDK, Vienna. An open issue for Semantic Question Answering systems is bridging the so-called lexical gap, referring to the fact that the vocabulary used by users in framing a question needs to be interpreted with respect to the logical vocabulary used in the data model of a given knowledge base or knowledge graph. Building on previous work that automatically induces ontology lexica from language corpora by using association rules to identify correspondences between lexical elements on the one hand and ontological vocabulary elements on the other, in this paper we propose LexExMachinaQA, a framework that allows us to evaluate the impact of automatically induced lexicalizations on alleviating the lexical gap in QA systems. Our framework combines the LexExMachina approach (Ell et al., 2021) for lexicon induction with the QueGG system (Benz et al., 2020), which relies on grammars automatically generated from ontology lexica to parse questions into SPARQL. We show that automatically induced lexica yield a decent performance in terms of F1 measure on the QALD-7 dataset, representing a 34% to 56% performance degradation with respect to a manually created lexicon. While these results show that the fully automatic creation of lexica for QA systems is not yet feasible, the method could certainly be used to bootstrap the creation of a lexicon in a semi-automatic manner, thus having the potential to significantly reduce the human effort involved.
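    How an induced lexicalization can bridge the lexical gap at question time can be sketched as follows. The lexicon entries, the DBpedia-style IRIs, and the query template are assumed examples, not the actual LexExMachinaQA or QueGG implementation:

    ```python
    # Hypothetical induced lexicon: per class, an adjective is mapped to a KB
    # property and object (invented DBpedia-style names for illustration).
    lexicon = {
        ("Person", "greek"): ("dbo:nationality", "dbr:Greece"),
        ("Settlement", "greek"): ("dbo:country", "dbr:Greece"),
    }

    def question_to_sparql(cls, adjective):
        """Interpret 'Which <cls>s are <adjective>?' as SPARQL via the lexicon."""
        entry = lexicon.get((cls, adjective.lower()))
        if entry is None:
            return None  # lexical gap: no induced lexicalization available
        prop, obj = entry
        return f"SELECT ?x WHERE {{ ?x a dbo:{cls} . ?x {prop} {obj} . }}"

    print(question_to_sparql("Person", "Greek"))
    ```

    The `None` branch is exactly where an incomplete automatically induced lexicon degrades QA performance, which is what the evaluation against a manually created lexicon measures.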

    Bridging the gap between Ontology and Lexicon via Class-specific Association Rules Mined from a Loosely-Parallel Text-Data Corpus

    No full text
    Ell B, Elahi MF, Cimiano P. Bridging the gap between Ontology and Lexicon via Class-specific Association Rules Mined from a Loosely-Parallel Text-Data Corpus. Presented at the LDK 2021 – 3rd Conference on Language, Data and Knowledge, Zaragoza, Spain.

    Terme-a-LLOD: Simplifying the Conversion and Hosting of Terminological Resources as Linked Data

    No full text
    In recent years, there has been increasing interest in publishing lexicographic and terminological resources as linked data. The benefit of using linked data technologies to publish terminologies is that terminologies can be linked to each other, creating a cloud of linked terminologies that crosses domains and languages and that supports advanced applications which exploit multiple terminologies seamlessly rather than working with a single terminology. We present Terme-a-LLOD (TAL), a new paradigm for transforming and publishing terminologies as linked data which relies on a virtualization approach. The approach rests on a preconfigured virtual image of a server that can be downloaded and installed. We describe our approach to simplifying the transformation and hosting of terminological resources in the remainder of this paper. We provide a proof of concept for this paradigm, showing how to apply it to the conversion of the well-known IATE terminology as well as to various smaller terminologies. Further, we discuss how the implementation of our paradigm can be integrated into existing NLP service infrastructures that rely on virtualization technology. While we apply this paradigm to the transformation and hosting of terminologies as linked data, it can be applied to any other resource format as well.
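    What "transforming a terminology into linked data" amounts to can be sketched minimally: a term record with multilingual labels becomes RDF triples. The SKOS-based mapping, the example.org IRIs, and the term data below are illustrative assumptions, not TAL's actual conversion pipeline:

    ```python
    # Hypothetical sketch: serialize a simple term record as SKOS-style
    # N-Triples. Vocabulary choice and IRIs are invented for illustration.
    def term_to_triples(term_id, labels):
        base = f"http://example.org/term/{term_id}"
        triples = [f"<{base}> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
                   f"<http://www.w3.org/2004/02/skos/core#Concept> ."]
        for lang, label in labels.items():
            triples.append(
                f'<{base}> <http://www.w3.org/2004/02/skos/core#prefLabel> "{label}"@{lang} .')
        return triples

    for t in term_to_triples("12345", {"en": "data protection", "de": "Datenschutz"}):
        print(t)
    ```

    Once terms from different terminologies are IRIs, linking them across domains and languages reduces to asserting further triples between those IRIs, which is what enables the "cloud of linked terminologies".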

    An Italian Question Answering System Based on Grammars Automatically Generated from Ontology Lexica

    No full text
    The paper presents an Italian question answering system over linked data. We use a model-based approach to question answering based on an ontology lexicon in lemon format. The system exploits an automatically generated lexicalized grammar that can be used to interpret questions and transform them into SPARQL queries. We apply the approach to the Italian language and implement a question answering system that can answer more than 1.6 million questions over the DBpedia knowledge graph.
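    A small lexicalized grammar can cover a large question space because lexicalizations combine multiplicatively. The sketch below illustrates only this combinatorial effect; the Italian phrasings, the property and entity tables, and the IRIs are invented examples, not the system's actual grammar:

    ```python
    # Invented lexicalizations: property labels and entity labels (Italian),
    # each mapped to a DBpedia-style IRI.
    properties = {"capitale": "dbo:capital", "sindaco": "dbo:mayor"}
    entities = {"Italia": "dbr:Italy", "Francia": "dbr:France"}

    def generate():
        """Expand one question template over all lexicalizations,
        pairing each question with its SPARQL interpretation."""
        for p_lex, p_iri in properties.items():
            for e_lex, e_iri in entities.items():
                question = f"Qual è la {p_lex} di {e_lex}?"
                sparql = f"SELECT ?x WHERE {{ {e_iri} {p_iri} ?x . }}"
                yield question, sparql

    pairs = list(generate())
    print(len(pairs))  # 2 properties x 2 entities = 4 question/SPARQL pairs
    ```

    With realistically sized lexica and multiple templates, this multiplicative expansion is how a generated grammar can interpret millions of distinct questions.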

    A New Pseudo-automatic Outer Lip Contour Extraction Approach Based on RGB Components

    No full text
    Detection and tracking of the lip contour is an important issue in speech reading. While there are solutions for lip tracking once a good contour initialization in the first frame is available, the problem of finding such a good initialization has not yet been solved automatically and is still done manually. Solutions based on edge detection and tracking have failed when applied to real-world mouth images. In this paper we propose a solution to lip contour detection that minimizes user interaction: only a minimal number of points need to be manually marked on the mouth image as initial reference points. The method is based on examining the values of the RGB components of the outer surface of the region enclosed by the outer lip contour. The paper also discusses the limitations of other existing approaches.
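    One common way RGB components are used to separate lip pixels from skin is a red-to-green pseudo-hue, since lips are typically redder than surrounding skin. The sketch below shows only this generic idea with an assumed threshold; it is not the paper's actual extraction procedure:

    ```python
    import numpy as np

    def lip_mask(rgb, threshold=0.6):
        """Return a boolean mask of lip-like pixels using the pseudo-hue
        R/(R+G); the threshold is an assumed illustrative value."""
        rgb = rgb.astype(np.float64)
        r, g = rgb[..., 0], rgb[..., 1]
        pseudo_hue = r / (r + g + 1e-9)  # in [0, 1]; larger for redder pixels
        return pseudo_hue > threshold

    # Toy 2x2 "image": lip-like reddish pixels and skin-like pixels (invented values)
    img = np.array([[[200, 90, 80], [180, 140, 120]],
                    [[190, 150, 130], [210, 100, 90]]], dtype=np.uint8)
    print(lip_mask(img))
    ```

    Such a mask gives a coarse lip region from which an outer contour can then be traced, with the manually marked reference points constraining where the contour is anchored.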